Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gh-128198: Add missing error checks for usages of PyIter_Next #128199

Merged
merged 7 commits into from
Dec 25, 2024

Conversation

WolframAlph
Copy link
Contributor

@WolframAlph WolframAlph commented Dec 23, 2024

@WolframAlph WolframAlph force-pushed the WolframAlph/fix-pyiternext branch from ca54407 to c6d319f Compare December 23, 2024 16:33
@rhettinger rhettinger removed their request for review December 23, 2024 18:01
Copy link
Member

@ZeroIntensity ZeroIntensity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In 3.14, we have a new PyIter_NextItem for exactly this problem. I think it's worth migrating to that in this PR instead of trying to adapt to the bad design choices of PyIter_Next.

@WolframAlph
Copy link
Contributor Author

👍 Will check it out and update the PR

@WolframAlph
Copy link
Contributor Author

@ZeroIntensity I can see PyIter_NextItem being used only in couple places. Should I rename the issue to indicate migration from old PyIter_Next to new PyIter_NextItem api and work on replacing old one in the entire codebase?

@ZeroIntensity
Copy link
Member

I don't think we need to replace PyIter_Next throughout the entire codebase (though it is probably inevitable), just change it for the cases that are problematic.

Should I rename the issue to indicate migration from old PyIter_Next

Nah, just update the PR title.

@WolframAlph
Copy link
Contributor Author

WolframAlph commented Dec 24, 2024

@ZeroIntensity so if this is inevitable (otherwise why would new PyIter_NextItem be implemented) and I can see ~63 instances of PyIter_Next usage in the codebase, I would like to spend time on migrating to the new version in this PR. Or at least those instances that do not require significant rewrite and don't introduce more complexity. Any objections from your side? Or if this is still too much for a single PR, maybe create top level issue to track migration and add separate issues for migration in each file where usages occur?

Modules/_collectionsmodule.c Outdated Show resolved Hide resolved
Modules/_collectionsmodule.c Outdated Show resolved Hide resolved
Modules/_asynciomodule.c Show resolved Hide resolved
@picnixz picnixz changed the title gh-128198: Add missing error checks for PyIter_Next gh-128198: Add missing error checks for usages of PyIter_Next Dec 24, 2024
@serhiy-storchaka
Copy link
Member

If older Python versions have the same code, it will be needed to backport these changes. In such case it is better to use the older API, unless you want to add a work for yourself and for reviewers.

@ZeroIntensity
Copy link
Member

Ah, right. I would be in favor of a follow-up PR to switch to PyIter_NextItem though.

@WolframAlph WolframAlph requested a review from picnixz December 25, 2024 09:41
Copy link
Contributor

@picnixz picnixz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (I'll just run the refleaks build bots since the previous run was interrupted, just in case)

@picnixz picnixz added the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Dec 25, 2024
@bedevere-bot
Copy link

🤖 New build scheduled with the buildbot fleet by @picnixz for commit 3aa46e7 🤖

If you want to schedule another build, you need to add the 🔨 test-with-refleak-buildbots label again.

@bedevere-bot bedevere-bot removed the 🔨 test-with-refleak-buildbots Test PR w/ refleak buildbots; report in status section label Dec 25, 2024
@WolframAlph
Copy link
Contributor Author

Regarding migration to PyIter_NextItem, can I create top level issue and tag you guys? @serhiy-storchaka @picnixz @ZeroIntensity ?

@picnixz
Copy link
Contributor

picnixz commented Dec 25, 2024

Yes, you can just request my review. But before creating the issue, you should locally verify that:

  • the changes would not impact performances (I don't know which one is faster); try to check if PyIter_Next is actually faster overall or if it's PyIter_NextItem).
  • some parts of the code directly use tp_iternext under some assumptions and for performance reasons. Those parts should not be modified.
  • locally investigate how you want to split the task. Changing a lot of places may or may not be easy to review. There are more than 50 usages my IDE found so maybe splitting it up by file would be better.
  • let's first focus on parts that don't need much rewrite; it's fine if some parts are still using the old API though.

FTR, changing the usages could perhaps make future backports harder, so I don't know whether it's worth the shot.

@WolframAlph
Copy link
Contributor Author

WolframAlph commented Dec 25, 2024

locally investigate how you want to split the task. Changing a lot of places may or may not be easy to review. There are more than 50 usages my IDE found so maybe splitting it up by file would be better.

Yes, my initial thought was to create top level issue to track all sub issues for each file separately to make review process easier.

FTR, changing the usages could perhaps make future backports harder, so I don't know whether it's worth the shot.

I presume this is not first time of migrating to newer API. How were those cases handled?

@WolframAlph
Copy link
Contributor Author

One more question. When comparing code for PyIter_Next:

cpython/Objects/abstract.c

Lines 2914 to 2919 in d9ed42b

PyIter_Next(PyObject *iter)
{
PyObject *item;
(void)iternext(iter, &item);
return item;
}

and PyIter_NextItem:

cpython/Objects/abstract.c

Lines 2891 to 2903 in d9ed42b

PyIter_NextItem(PyObject *iter, PyObject **item)
{
assert(iter != NULL);
assert(item != NULL);
if (Py_TYPE(iter)->tp_iternext == NULL) {
*item = NULL;
PyErr_Format(PyExc_TypeError, "expected an iterator, got '%T'", iter);
return -1;
}
return iternext(iter, item);
}

I can see that latter has Py_TYPE(iter)->tp_iternext == NULL check while former does not. Is it intentional? If so - what is the reason?

@picnixz
Copy link
Contributor

picnixz commented Dec 25, 2024

Yes it's intentional. Docs (https://docs.python.org/3/c-api/iter.html#c.PyIter_Next) say:

The object must be an iterator according to PyIter_Check() (it is up to the caller to check this).

The reason is to make the call more efficient when the caller knows that the object is a correct iterator. OTOH PyIter_NextItem makes the check for the caller (if Py_TYPE is a function call, this would add one function call and one pointer comparison; on incorrect inputs, we don't care about performances).

I don't know how less (or more?) efficient it would be but it might not necessarily by much. If you're really interested in preserving performance, we could first glance at the assembly and benchmarks some simple statements, though I suspect that naive benchmarks would hide too much noise (but maybe not?).

Now, if the caller is checking the input to PyIter_Next, then you can maybe remove that check and directly use PyIter_NextItem instead (in the end, you'll move the cost of the check to PyIter_NextItem, though it will cost a function call (I assume that the checks are actually inlined checks but maybe not)).

I presume this is not first time of migrating to newer API. How were those cases handled?

I don't know, I only became active with CPython in June :')

@serhiy-storchaka
Copy link
Member

I presume this is not first time of migrating to newer API. How were those cases handled?

We migrate to newer API if it is faster, safer or more convenient. Or if there are plans to remove the older API.

There are no plans to remove PyIter_Next(), it is not broken. In theory, it may be marginally faster because it does not need to call PyErr_Occurred(), but on other hand, additional check Py_TYPE(iter)->tp_iternext == NULL can make it slower.

I would recommend to use a newer API in a new code if it makes the code clearer, and do not touch existing code (unless you have other reasons to rewrite it).

@serhiy-storchaka serhiy-storchaka merged commit 5c814c8 into python:main Dec 25, 2024
59 checks passed
@serhiy-storchaka serhiy-storchaka added needs backport to 3.12 bug and security fixes needs backport to 3.13 bugs and security fixes labels Dec 25, 2024
@miss-islington-app
Copy link

Thanks @WolframAlph for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.13.
🐍🍒⛏🤖 I'm not a witch! I'm not a witch!

@miss-islington-app
Copy link

Thanks @WolframAlph for the PR, and @serhiy-storchaka for merging it 🌮🎉.. I'm working now to backport this PR to: 3.12.
🐍🍒⛏🤖

@miss-islington-app
Copy link

Sorry, @WolframAlph and @serhiy-storchaka, I could not cleanly backport this to 3.13 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 5c814c83cdd3dc42bd9682106ffb7ade7ce6b5b3 3.13

@miss-islington-app
Copy link

Sorry, @WolframAlph and @serhiy-storchaka, I could not cleanly backport this to 3.12 due to a conflict.
Please backport using cherry_picker on command line.

cherry_picker 5c814c83cdd3dc42bd9682106ffb7ade7ce6b5b3 3.12

serhiy-storchaka pushed a commit to serhiy-storchaka/cpython that referenced this pull request Dec 26, 2024
…_Next() (pythonGH-128199)

(cherry picked from commit 5c814c8)

Co-authored-by: Yan Yanchii <yyanchiy@gmail.com>
@bedevere-app
Copy link

bedevere-app bot commented Dec 26, 2024

GH-128272 is a backport of this pull request to the 3.13 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.13 bugs and security fixes label Dec 26, 2024
serhiy-storchaka pushed a commit to serhiy-storchaka/cpython that referenced this pull request Dec 26, 2024
…_Next() (pythonGH-128199)

(cherry picked from commit 5c814c8)

Co-authored-by: Yan Yanchii <yyanchiy@gmail.com>
@bedevere-app
Copy link

bedevere-app bot commented Dec 26, 2024

GH-128273 is a backport of this pull request to the 3.12 branch.

@bedevere-app bedevere-app bot removed the needs backport to 3.12 bug and security fixes label Dec 26, 2024
serhiy-storchaka added a commit that referenced this pull request Dec 26, 2024
GH-128199) (GH-128273)

(cherry picked from commit 5c814c8)

Co-authored-by: Yan Yanchii <yyanchiy@gmail.com>
serhiy-storchaka added a commit that referenced this pull request Dec 26, 2024
GH-128199) (GH-128272)

(cherry picked from commit 5c814c8)

Co-authored-by: Yan Yanchii <yyanchiy@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants